Abstract

Multispecies microbial communities in natural solid-state fermentation (SSF) are
crucial for the formation of the unique quality of Chinese Pu-erh tea. However, the
association between the microbiota and tea quality is still poorly understood. Herein,
shotgun metagenomic and metabolomic analyses showed that significant variations in
the composition of the microbiota, collective functional genes, and flavour compounds
occurred during the SSF process. Furthermore, the formation pathways of the dominant
flavours, including theabrownin, methoxy-phenolic compounds, alcohols and carvone,
were proposed. Moreover, biological interaction network analysis among the functional
core microbiota, functional genes, and dominant flavours indicated that Aspergillus was
the main flavour-producing microorganism in early SSF, while many other genera,
including Bacillus, Rasamsonia, Lichtheimia and Debaryomyces, were identified as the
functional core microorganisms for flavour production in late SSF. This study
provides a perspective for bridging the gap between the microbiota and quality in
Pu-erh tea, and will benefit further optimization of production efficiency and product
quality.

1. Introduction

As a well-known traditional Chinese beverage, Pu-erh tea not only has
distinctive sensory characteristics, including a mellow taste, stable flavor, and
brownish-red color, but also has multiple health benefits, such as antiobesity,
hypolipidemic, antitumor, free radical scavenging, and toxicity-suppressing
activities. Unlike other kinds of tea, post-fermented Pu-erh tea is produced by a
natural solid-state fermentation (SSF) process using sun-dried green tea leaves
(Camellia sinensis var. assamica) as the raw material. Generally, the sun-dried green
tea is moistened with water and piled in windrows in the fermentation room for a few
weeks. To ensure homogeneity, the tea leaves are turned over about once a week during
pile fermentation, and the fermentation is stopped when the fermented tea mass is
reddish-brown and free from astringent taste. This natural SSF process is generally
thought to lead to a series of oxidation, condensation, and degradation reactions in
the chemical constituents of the tea leaves, which finally form the special quality
and flavor characteristics of Pu-erh tea. Previous studies have found that the
contents of tea polyphenols, free amino acids, catechins, theaflavins and thearubigins
decrease dramatically, while the contents of gallic acid and theabrownins (TB)
increase significantly during the Pu-erh tea SSF process. The volatile compounds
contributing to the unique aroma of Pu-erh tea consist mainly of methoxy-phenolic
compounds, alcohols, hydrocarbons, aldehydes, ketones and esters.

Metabolic actions of diverse microorganisms, including bacteria and fungi, are
thought to play key roles in the formation of the quality and flavor of Pu-erh tea.
How the total microbial community shapes the special quality of Pu-erh tea is very
important for further quality improvement but remains poorly characterized.
Previously, only a small proportion of microorganisms, such as Aspergillus sp. and
Blastobotrys sp., could be cultured from Pu-erh tea, and the diversity of the fungal
and bacterial communities in the Pu-erh tea SSF process was studied only separately
through culture-independent approaches such as denaturing gradient gel
electrophoresis, clone library sequencing, and high-throughput amplicon sequencing
based on 16S rDNA or 18S rDNA. Recently, shotgun metagenomics has become a powerful
methodology for deciphering the influence of the total microbial community on the
flavor and aroma of fermented foods, because it provides not only the structure of
the total microbial community (including bacteria and fungi), but also the metabolic
potential and functional profiles of microbial communities [13, 14]. Meanwhile,
because Pu-erh tea SSF is a dynamic process, a previous preliminary metagenomic study
at only one fermentation time point was insufficient for exploring the variation of
the microbial taxonomy and functional genes in Pu-erh tea. Thus, in this study, the
distribution and metabolic potential of the microbial community, and the flavour
compounds formed during the Pu-erh tea SSF process, were comparatively investigated
by shotgun metagenomic and metabolomic approaches for the first time. Furthermore,
metabolic network analysis was used to evaluate the complex relationships among the
functional core microorganisms, metabolic pathways and dominant flavor compounds in
Pu-erh tea.


2. Materials and methods
2.1. Pu-erh tea fermentation and sample collection

Pu-erh tea samples undergoing the pile fermentation process were kindly provided
by the Pu-erh tea production factory, Menghai-Courtyard Tea Trading Company
(Yunnan, China) in 2013. The sun-dried green tea used as the raw material was
moistened with water and piled in windrows in the fermentation room. To ensure
homogeneity, the tea leaves were turned over about once a week during pile
fermentation, and the fermentation was stopped when the fermented tea mass was
reddish-brown and free from astringent taste. Pile fermentation was performed in
triplicate. Tea samples collected from the piles at 0, 15, 29, and 45 days were
immediately transported to the laboratory on dry ice and stored at -80 °C for
further analysis.


2.2. Comparative analysis of flavours during SSF process 

One gram of tea leaves from each sample was soaked in 10 mL of sterile distilled
water for 30 min, and the pH of the suspension was measured using a glass pH electrode
(FiveEasy FE20; Mettler-Toledo AG, Greifensee, Switzerland). The contents of tea
polyphenols in the tea liquor were determined by the spectrophotometric method based
on FeSO4, and the contents of theabrownin (TB) were also determined
spectrophotometrically. The volatile compounds were detected by headspace
solid-phase microextraction and simultaneous distillation-extraction (HS-SPME)
coupled with gas chromatography-mass spectrometry (GC-MS). Briefly, 5 g of ground
tea sample, 4.8 g of NaCl and 20 mL of distilled water were incubated in a 100 mL
sealed headspace vial (Sigma Chemical Co.) at 60 °C for 10 min; the SPME fiber
(65 μm PDMS/DVB) was then exposed to the headspace for 60 min and immediately
inserted into the GC-MS injector port at 250 °C for 5 min. The volatile compounds
collected by SPME were analyzed by GC-MS using an Agilent 5975C mass selective
detector coupled to an Agilent 7890A GC (Agilent, Santa Clara, CA). Helium was used
as the carrier gas at a flow rate of 1 mL/min. An HP-5MS capillary column
(30 m × 0.25 mm inner diameter, 0.25 μm film thickness, Agilent) was used for GC
separation. Desorption of analytes from the SPME fiber was carried out in splitless
mode at 250 °C for 5 min. The temperature program was as follows: the oven was held
at 50 °C for 3 min, raised to 125 °C (held for 3 min) at 3 °C/min, then ramped to
180 °C (held for 3 min) at 4 °C/min, and finally raised to 250 °C (held for 2 min)
at 15 °C/min. The mass spectrometer was operated in electron-impact mode at 70 eV.
The temperatures of the interface, ion source, and quadrupole were 280, 230, and
150 °C, respectively. The mass scan range was 40-500 atomic mass units. Each compound
was identified using the NIST08 mass spectral library and further confirmed by
comparing its mass spectrum. Relative contents were used as the quantitative
results, obtained as described previously.
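As a quick check on the GC oven temperature program described above, the total run time can be worked out from the ramps and holds. The following is only an illustrative sketch: the names `PROGRAM` and `total_run_time` are hypothetical, and the step values are taken from the text.

```python
# Oven program as (target_temp_C, ramp_rate_C_per_min, hold_min).
# A ramp rate of None marks the initial isothermal step.
PROGRAM = [
    (50.0, None, 3.0),
    (125.0, 3.0, 3.0),
    (180.0, 4.0, 3.0),
    (250.0, 15.0, 2.0),
]


def total_run_time(program):
    """Return the total oven run time in minutes (ramps plus holds)."""
    total = 0.0
    previous_temp = None
    for target, rate, hold in program:
        if rate is not None:
            # Time spent ramping from the previous plateau to this target.
            total += (target - previous_temp) / rate
        total += hold
        previous_temp = target
    return total


print(f"{total_run_time(PROGRAM):.2f} min")  # prints "54.42 min"
```

Under these values, one injection occupies the oven for a little under 55 min, excluding cool-down.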


2.3. Genomic DNA extraction, library construction and sequencing 
The microbial cells in the Pu-erh tea samples were collected as described
previously, and the genomic DNA of the microorganisms was extracted using a
protocol specific for high-molecular-weight environmental DNA. The metagenomic
DNA libraries were constructed with 2 μg of genomic DNA according to the Illumina
TruSeq DNA Sample Prep v2 Guide, with an average insert size of 350 bp. The
quality of all libraries was evaluated using an Agilent Bioanalyzer with a DNA
LabChip 1000 kit. Sequencing was performed on an Illumina Genome Analyzer
system according to the manufacturer's protocol at ABlife company (Wuhan, China).
The sequence data were deposited in the NCBI Sequence Read Archive (SRA,
http://www.ncbi.nlm.nih.gov/Traces/sra) under accession number SUB3619204.


2.4. Sequence assembly, gene prediction and pathway annotation

After quality filtering as described previously, all shotgun metagenomic datasets
were rarefied to the same sequencing depth by randomly resampling the valid sequences
of each sample before downstream analysis. The valid sequences were assembled using
SOAPdenovo v1.06 with the parameters -K 27 (k-mer size) -R -M 3 -d 1. Genes were
predicted using MetaGeneMark with default parameters. Translated genes were
annotated using DIAMOND (v0.7.9) against the NR database with an E-value ≤ 1e-5 for
species analysis. Translated genes were further annotated using DIAMOND (v0.7.9)
against the KEGG (version 201609, http://www.kegg.jp/kegg/), eggNOG
(version 4.5, http://eggnogdb.embl.de/#/app/home) and CAZy [24] (version 20150704,
http://www.cazy.org/) databases for gene function analysis.
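The rarefaction step mentioned at the start of this section can be illustrated with a minimal sketch. It assumes that every sample is subsampled without replacement to the depth of the smallest sample; the text does not name the tool or depth actually used, and `rarefy`, `rarefy_all` and the toy read identifiers are hypothetical.

```python
import random


def rarefy(read_ids, depth, seed=0):
    """Randomly draw `depth` reads without replacement from one sample."""
    if depth > len(read_ids):
        raise ValueError("depth exceeds the number of available reads")
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    return rng.sample(read_ids, depth)


def rarefy_all(samples, seed=0):
    """Subsample every sample to the depth of the smallest one."""
    depth = min(len(reads) for reads in samples.values())
    return {name: rarefy(reads, depth, seed) for name, reads in samples.items()}


# Toy data: read identifiers only (hypothetical sample names and sizes).
samples = {
    "day15_rep1": [f"read{i}" for i in range(1000)],
    "day30_rep1": [f"read{i}" for i in range(750)],
}
rarefied = rarefy_all(samples)
print({name: len(reads) for name, reads in rarefied.items()})
```

Equalizing depth in this way keeps differences in gene or taxon counts between samples from simply reflecting differences in sequencing effort.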


2.5. Biological interaction network analysis of the dominant flavours, functional genes and functional microorganisms

Collectively, methoxy-phenolic compounds, theabrownin, alcohols and carvone were
determined to be the dominant flavors of Pu-erh tea, and the relative abundances of
their functional microorganisms were further analyzed. Taking methoxy-phenolic
compounds as an example, methyltransferases have been reported as the main functional
genes (enzymes). To clarify the composition of the methyltransferase-producing
microorganisms, the gene IDs of the methyltransferases, the corresponding relative
abundance of each methyltransferase, and the taxonomic annotation were first
extracted from the eggNOG annotation. Then, the relative abundances of the producing
microorganisms at different taxonomic levels (kingdom or genus) at day 15 or day 30
were recalculated, taking the total relative abundance of all methyltransferases
at day 15 or day 30, respectively, as 100%.
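The renormalization described above can be sketched as follows. This is an illustration only: the record layout (gene ID, genus, relative abundance), the function name `producer_shares` and all numbers are hypothetical and are not taken from the study's annotation tables.

```python
from collections import defaultdict


def producer_shares(records):
    """Sum enzyme abundance per genus and rescale so the total is 100%.

    `records` is an iterable of (gene_id, genus, relative_abundance) tuples
    for a single enzyme family at a single time point.
    """
    totals = defaultdict(float)
    for _gene_id, genus, abundance in records:
        totals[genus] += abundance
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no abundance to normalise")
    return {genus: 100.0 * value / grand_total for genus, value in totals.items()}


# Hypothetical methyltransferase genes at one time point.
day15 = [
    ("gene_0001", "Pseudomonas", 0.020),
    ("gene_0002", "Pseudomonas", 0.012),
    ("gene_0003", "Aspergillus", 0.028),
    ("gene_0004", "Debaryomyces", 0.004),
]
shares = producer_shares(day15)
for genus, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{genus}: {share:.2f}%")
```

Because each time point is rescaled to its own total, the resulting shares describe which genera contribute the enzyme, not how much enzyme is present overall.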

Furthermore, the correlative flavor biosynthesis networks of the dominant
metabolites of Pu-erh tea, the functional genes and the top 10 producing genera were
plotted using Cytoscape v. 2.8.1. The increasing or decreasing trend in the relative
abundance of the related microorganisms between day 15 and day 30 is presented as UP
or DOWN in the “interaction” column, as shown in Table S5.
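A minimal sketch of how such an edge table with an UP/DOWN "interaction" column could be assembled is given below. The genus names and shares are placeholders, the helper names are hypothetical, and the "UNCHANGED" label for ties is an assumption not mentioned in the text.

```python
def trend(before, after):
    """Label the change in relative abundance between two time points."""
    if after > before:
        return "UP"
    if after < before:
        return "DOWN"
    return "UNCHANGED"  # assumption: the text only describes UP and DOWN


def interaction_rows(enzyme, day15, day30, top_n=10):
    """Return (genus, interaction, enzyme) rows for the top producing genera."""
    genera = set(day15) | set(day30)
    # Rank by the larger of the two shares; ties are broken alphabetically.
    ranked = sorted(
        genera,
        key=lambda g: (-max(day15.get(g, 0.0), day30.get(g, 0.0)), g),
    )
    return [
        (g, trend(day15.get(g, 0.0), day30.get(g, 0.0)), enzyme)
        for g in ranked[:top_n]
    ]


# Placeholder shares (%) of one enzyme family per genus.
day15 = {"genus_A": 50.0, "genus_B": 40.0, "genus_C": 2.0}
day30 = {"genus_A": 1.0, "genus_B": 8.0, "genus_C": 17.0, "genus_D": 9.0}

rows = interaction_rows("methyltransferase", day15, day30)
for source, interaction, target in rows:
    print(f"{source}\t{interaction}\t{target}")
```

A tab-separated table of this shape (source, interaction, target) is a common input form for network visualization tools.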


2.6. Data statistical analysis
Data are expressed as mean ± standard error of the mean (SEM). All data were
subjected to analysis of variance using GraphPad software (v 5.01, La Jolla,
CA, USA). Differences between the means were tested by one-way ANOVA, and all
detected significant differences were further evaluated by Tukey’s post hoc test.
Differences were considered significant at p < 0.05 and highly significant at
p < 0.01. Principal component analysis (PCA) was performed using Canoco
software (version 4.5, Biometris, Wageningen, Netherlands).
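For readers who wish to reproduce the summary statistics outside GraphPad, the mean ± SEM and the one-way ANOVA F statistic can be computed as in the following sketch. It uses only the Python standard library, the replicate values are hypothetical, and the Tukey post hoc test and p-values are not covered.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of replicate groups."""
    all_values = [v for group in groups for v in group]
    grand_mean = statistics.mean(all_values)
    group_means = [statistics.mean(group) for group in groups]
    ss_between = sum(
        len(group) * (m - grand_mean) ** 2 for group, m in zip(groups, group_means)
    )
    ss_within = sum(
        (v - m) ** 2 for group, m in zip(groups, group_means) for v in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical triplicate measurements at three time points.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
for group in groups:
    mean, sem = mean_sem(group)
    print(f"{mean:.2f} ± {sem:.2f}")
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The F statistic would then be compared with the F distribution at the stated degrees of freedom to obtain the p-value.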


3. Results and Discussion 
3.1. Dynamics of flavour compounds during the Pu-erh tea SSF process

To investigate the dynamics of the flavor compounds of Pu-erh tea, tea leaf
samples from days 0, 15, 30 and 45 were comparatively analyzed. As shown in Fig. 1A,
the tea leaves darkened and softened, and the tea liquors gradually turned
reddish-brown during the SSF process. The initial pH of the tea liquor of the raw
material was 5.99; it first decreased sharply to 5.11 at day 15, then increased to
6.28 at day 30, and finally reached 6.58 at day 45.

As a predominant aroma factor, the volatile compounds of Pu-erh tea at days 0, 15,
30 and 45 were analyzed by GC-MS. The total ion current chromatograms of the tea
samples at day 30 and day 45 were very similar to each other, but differed
significantly from those of the raw material and the day 15 sample (Fig. 1A). Further
analysis showed that 85 volatile compounds were detected in the raw material, and
that the number of volatile compounds increased to 94, 95 and 92 at days 15, 30 and
45, respectively. Compared with the raw material, three volatile compounds
disappeared and 11 new volatile compounds appeared in the tea leaves at day 45.

All these volatile compounds could be divided into 9 categories, comprising 14
hydrocarbons, 16 alcohols, 10 ketones, 10 esters, 8 methoxy-phenolic compounds, 3
aldehydes, 2 phenolic compounds, 1 nitrogen compound and 1 lactone compound; the
average contents of the volatile compounds identified at the different fermentation
stages are presented in Table S4. As shown in Fig. 1B, alcohols and hydrocarbons
were the main volatile compounds in the raw material, with relative abundances of
38.27% and 31.69%, respectively. At day 15, alcohols were still the predominant
volatile compounds, with their content increasing to 54.99%, and methoxy-phenolic
compounds and phenols also began to increase, to 8.49% and 0.38% (p<0.05). Although
hydrocarbons were still the second most abundant volatile compounds at day 15, their
content declined sharply to 18.87% (p<0.05). The other volatile compounds were all
reduced or unchanged. At day 30, methoxy-phenolic compounds became the predominant
volatile compounds in Pu-erh tea, with their content rising to 37.92%, while alcohols
decreased to 31.70%, hydrocarbons declined to 15.77%, and lactones and phenols
increased to 0.64% and 0.47%, respectively. At day 45, the main volatile compounds
were still methoxy-phenolic compounds (48.68%), followed by alcohols (21.00%) and
hydrocarbons (14.88%) (Fig. 1B). Our results showed a large shift in the composition
of the volatile compounds of Pu-erh tea during the SSF process, which was also
corroborated by the separation of the fermentation periods in the principal
component analysis (PCA) (Fig. 1C). Moreover, most of the characteristic volatile
compounds of Pu-erh tea, such as methoxy-phenolic compounds, were mainly formed in
the late stage of SSF, which suggests that the fermentation time is critical for
flavor formation.


3.2. Metagenomic sequencing, gene assembly and prediction

The analysis of flavour compound dynamics above showed that the tea quality and
flavor composition at day 30 were quite similar to those at day 45, but
significantly different from those at day 15. This suggests that the typical Pu-erh
tea quality was essentially formed within the first 30 days, and the day 30 sample
was considered more appropriate than the day 45 sample for analyzing variations in
the functional microorganisms and genes. Thus, to better understand the natural SSF
process of traditional Chinese Pu-erh tea, the day 15 and day 30 samples were chosen
for the following metagenomic assay, each analyzed in triplicate. A total of 570
million raw reads for the six tea samples (about 95 million per sample) were
generated using the HiSeq 2000 system. After strict quality control, the clean reads
of day 15 and day 30 were assembled into a total of 12,579 and 87,720 scaftigs, with
average lengths of 3,449 bp and 1,113 bp, respectively (Table S1). After ORF
prediction by MetaGeneMark, a total of 203,366 ORFs were found in the six samples,
with a GC content of 51.76%, and 47.56% of these genes possessed a complete ORF
(Table S2). The total length and average length of these genes were 114.81 Mbp and
564.25 bp, respectively (Table S2). As shown in Fig. 2A, correlation analysis of gene
abundance among the six samples showed good within-group consistency, which suggests
good repeatability of the Pu-erh tea fermentation. The number of predicted genes
increased significantly (P<0.05) from 131,808 at day 15 to 196,890 at day 30, and
71,582 novel genes (36.35%) were identified at day 30 (Fig. 2B), which indicates
that a large shift occurred in the gene composition of Pu-erh tea during the SSF
process.


3.3. Dynamics of the functional microbiota during the Pu-erh tea SSF process

Short fragments of protein families are well suited as phylogenetic markers for
inferring the taxonomic affiliations of shotgun metagenomic sequences. They are also
useful for comparing the abundance of Eukaryota and Bacteria at the same statistical
level, which could not be achieved by previous microbiota studies using 16S rDNA and
18S rDNA. In this study, the relative abundance of Eukaryota was higher than that of
Bacteria throughout the whole SSF process (Fig. 3A), suggesting that Eukaryota play
the predominant role in the Pu-erh tea SSF process. This is inconsistent with a
previous Pu-erh metagenomic study, which found that the relative abundance of
Bacteria (76.26%) was much higher than that of Eukaryota (16.35%) at day 25. The
difference in microbial community between these two metagenomic studies might be due
to different raw materials, the indigenous microorganisms of the raw materials, and
the fermentation environment. Nevertheless, it should not be neglected that the
relative abundance of Bacteria increased notably in the late SSF stage. Earlier
studies using 16S rDNA sequences also found that the diversity of Bacteria increased
as the Pu-erh tea fermentation progressed. Taken together, these results imply the
importance of Bacteria, especially in the late SSF stage of Pu-erh tea.

Similar to previous studies using 16S rDNA/18S rDNA sequences, the microbial
community in Pu-erh tea mainly belonged to the phyla Proteobacteria, Firmicutes,
Actinobacteria, and Ascomycota [10, 25]. Our results further indicated that the
microbial community in the early SSF stage was dominated by Ascomycota (52.87%) and
Proteobacteria (21.36%), while the relative abundances of the other phyla were all
below 2% (Fig. 3B). In the late SSF stage, the relative abundances of Ascomycota and
Proteobacteria decreased to 37.12% and 16.21%, respectively, while those of
Firmicutes and Mucoromycota increased rapidly to 12.86% and 10.38% (Fig. 3B).

Further analysis at the genus and species levels showed that 485 genera and 1144
species were detected in the early SSF stage, and that these numbers increased
substantially to 719 and 2097 in the late SSF stage (Table S3). There were 481
genera and 1112 species shared between the two SSF stages, meaning that 66.90% of
the genera and 53.03% of the species at day 30 were carried over from the day 15
samples, which is similar to the previous study. It is also worth noting that 238
genera and 985 species were newly identified in the late SSF stage (Table S3),
accounting for 33.10% and 46.97% of the total genera and species, respectively.
These newly appearing microorganisms probably came from the Pu-erh tea fermentation
environment, suggesting that the fermentation environment is critical for the
different qualities of fermented tea from various areas. Among these microorganisms,
the predominant genus in the early fermentation stage was Aspergillus (42.10%),
followed by Pseudomonas (19.40%) and Debaryomyces (4.78%) (Fig. 3C). Aspergillus
niger was the predominant species, with a relative abundance of 12.88% (Fig. 3D). In
the late stage, Aspergillus decreased significantly to 6.51%, and Pseudomonas was
hardly detected. The relative abundances of many other genera, including
Achromobacter (8.39%), Debaryomyces (6.24%), Sugiyamaella (6.55%), Rasamsonia
(6.43%) and Lichtheimia (5.91%), increased significantly; in particular,
Sugiyamaella lignohabitans (6.55%) and Rasamsonia emersonii (6.43%) became the
predominant species (Fig. 3D). These results clearly indicate significant variations
in microbial quantity, abundance, diversity and composition during the Pu-erh tea
SSF process (Fig. 3E). Among these microorganisms, only a few, such as the genera
Aspergillus and Debaryomyces, were previously found to be beneficial for Pu-erh tea
manufacture, for example by increasing TB and vitamin contents and reducing caffeine
and tannins. Notably, several functional core microorganisms, such as Sugiyamaella,
Rasamsonia, Lichtheimia and Achromobacter, were found in Pu-erh tea for the first
time; they provide novel strain resources for further tea processing, and their
functions are explored later in this manuscript.
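The shared and newly detected proportions quoted above follow from simple arithmetic on the late-stage totals and the newly detected counts. The sketch below reproduces them; the function name `turnover` is hypothetical, and the shared count is derived as the late-stage total minus the newly detected taxa.

```python
def turnover(late_total, newly_detected):
    """Split the late-stage taxa into carried-over and newly detected ones."""
    shared = late_total - newly_detected
    return {
        "shared": shared,
        "shared_pct": round(100.0 * shared / late_total, 2),
        "new_pct": round(100.0 * newly_detected / late_total, 2),
    }


genera = turnover(late_total=719, newly_detected=238)
species = turnover(late_total=2097, newly_detected=985)
print(genera)   # shared = 481, shared_pct = 66.9, new_pct = 33.1
print(species)  # shared = 1112, shared_pct = 53.03, new_pct = 46.97
```

These values agree with the percentages reported in the text (66.90% and 33.10% for genera, 53.03% and 46.97% for species).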


3.4. Functional gene categories from the COG, CAZy and KEGG databases
These genes were functionally annotated and matched using the eggNOG, CAZy and
KEGG databases (E-value < 1e-5). All these genes were clustered into 24 eggNOG
categories; the top three categories in both SSF stages were S (function unknown),
E (amino acid transport and metabolism) and G (carbohydrate transport and
metabolism) (Fig. S1). In addition, about 25% of these genes belonged to category S,
meaning that their functions are unknown and possibly novel, even though we captured
the DNA sequences and used de novo assembly software to predict the genes.

According to the CAZy database annotation, the genes in the early SSF stage were
classified into six categories, including 1.72% Glycoside Hydrolases, 0.92%
GlycosylTransferases, 0.33% Carbohydrate-Binding Modules and 0.31% Auxiliary
Activities. In the late SSF stage, the relative abundances of Carbohydrate-Binding
Modules and Carbohydrate Esterases increased to 0.37% and 0.16%, while those of
Glycoside Hydrolases, GlycosylTransferases and Auxiliary Activities decreased to
1.55%, 0.88% and 0.18%, respectively. In addition, many degradation enzymes involved
in the hydrolysis of the plant cell wall were detected in both the early and late
SSF stages, including cellulose-degrading enzymes (cellobiose dehydrogenase,
glucanases and glucosidases) and hemicellulose-degrading enzymes
(arabinofuranosidases and xylosidases) (Fig. S2). A previous metaproteomic analysis
of Pu-erh tea detected many microbial extracellular enzymes, and Wang et al. also
showed that the surfaces of Pu-erh tea leaves were covered by microorganisms and
that the cell structures were largely disrupted after the SSF process. These results
indicate that microorganisms contribute substantially to the degradation of the tea
leaf cell wall, which might favor the development of the mellow taste of Pu-erh tea.

The functional genes in both SSF stages were divided into six categories based on
the KEGG pathway database annotations. Among them, “metabolism pathway” had the
highest relative abundance, which increased from 10.18% at day 15 to 13.38% at day
30. As shown in Fig. S3, further annotation analysis with the KEGG database showed
that these metagenomic genes were related to many secondary metabolite pathways,
especially carotenoid biosynthesis, geraniol degradation, limonene and pinene
degradation, sesquiterpenoid and triterpenoid biosynthesis, caffeine metabolism, and
flavone and flavonol biosynthesis, which is consistent with the previous metagenomic
study of Pu-erh tea. The combination of annotations from the CAZy, eggNOG and KEGG
databases indicated clear differences in microbial metabolic functions between the
early and late fermentation stages of Pu-erh tea.


3.5. Theabrownin production and its related microorganisms

Among the important parameters for evaluating the quality of Pu-erh tea,
theabrownin (TB) is not only essential for the unique color and taste of Pu-erh tea,
but has also been linked to some of its health-related claims, such as relieving
fatigue and lowering blood lipid levels. In this study, TB was found to be produced
in large quantities during the SSF process, with its content increasing from 2.18%
(day 0) to 20.73% (day 45) (Fig. 4B), which is consistent with the previous finding
that TB is the main water-soluble polymeric pigment of Pu-erh tea. TB was previously
reported to be derived from phenolic compounds in tea leaves, including tea
polyphenols, theaflavins (TFs) and thearubigins (TRs), as shown in Fig. 4A. In this
study, the content of tea polyphenols was indeed found to decrease steadily from day
0 (23.12%) to day 45 (11.51%) (Fig. 4B). The formation of TB from tea polyphenols is
thought to be mainly catalyzed by catechol oxidase, peroxidase and laccase (Fig.
4A). Among these three enzymes, no catechol oxidases (polyphenol oxidases) were
found in the Pu-erh tea microbiome, probably because catechol oxidases mostly exist
in plants, and are encoded in the tea tree genome in particular. In contrast,
peroxidases and laccases were abundant in the Pu-erh tea microbiome according to the
CAZy database annotation, and the relative abundance of peroxidases was much higher
than that of laccases (Fig. 4C). Moreover, the relative abundance of peroxidases
increased significantly, while that of laccases declined, during the SSF process
(Fig. 4C). Previous studies have shown that catechol oxidase in green tea is
inactivated by the fixation step, which protects the polyphenols from oxidative
degradation and results in the high polyphenol content of green tea, whereas the
polyphenols of black tea leaves are transformed into TFs and TRs by catechol
oxidase. In contrast to green tea and black tea, although catechol oxidase in the
raw material of Pu-erh tea is likewise destroyed by the fixation step, the
peroxidases and laccases produced by microorganisms can continuously transform tea
polyphenols into TFs and TRs, and further into TB, so that the TB content of Pu-erh
tea is much higher than that of other kinds of tea.

In addition, the laccase- and peroxidase-producing microorganisms were
comparatively analyzed. For laccase, 79.58% and 20.42% of the laccases originated
from Eukaryota and Bacteria at day 15, respectively. The predominant genus was
Aspergillus (70.36%), followed by Pseudomonas (15.90%) and Debaryomyces (3.76%)
(Fig. 4D). At day 30, the relative abundance of Eukaryota increased to 88.21%, and
the predominant genus changed to Rasamsonia (16.68%), followed by Aspergillus
(15.11%), Lichtheimia (14.43%), Blastobotrys (10.38%) and Debaryomyces (6.6%) (Fig.
4D). Meanwhile, the relative abundance of Bacteria decreased to 11.79%; in
particular, Pseudomonas decreased dramatically to 0.04%, but other bacterial genera
were found, including Achromobacter, Bacillus, Oceanobacillus, Alcaligenes and
Lactobacillus (Fig. 4D). For peroxidase, all of the genes came from Eukaryota
throughout the SSF process. At day 15, Aspergillus was the predominant genus, with a
relative abundance of 75.43%, followed by Debaryomyces (11.17%), Rosellinia (5.74%)
and Geotrichum (5.59%) (Fig. 4E). At day 30, however, the predominant genus changed
to Rosellinia (30.12%), followed by Geotrichum (23.94%), Debaryomyces (11.91%),
Aspergillus (10.44%), Lichtheimia (6.56%) and Rasamsonia (5.61%) (Fig. 4E). Overall,
our results suggest that Aspergillus is the main TB-producing genus in the early
stage, while in the late stage of SSF other genera, especially Rasamsonia,
Lichtheimia and Debaryomyces, play an important role in TB production by producing
peroxidases and laccases simultaneously (Fig. 8). Aspergillus sp., Rhizomucor sp.,
and Candida sp. were previously reported to produce TB, and the present study may
provide good and novel strain resources for producing TB.


3.6. Methoxy-phenolic compounds production and its related microorganisms

Methoxy-phenolic compounds are the most abundant aroma components in Pu-erh tea;
previous studies have found that Pu-erh tea contains a higher content of
methoxy-phenolic compounds than green tea, black tea, Oolong tea and white tea.
These methoxy-phenolic compounds, which have a stale odor, appear to be the
characteristic aroma components of Pu-erh tea and could be used to discriminate
Pu-erh tea from other kinds of tea. In this study, the relative abundance of
methoxy-phenolic compounds was only 3.39% in the raw material; it increased to 8.49%
at day 15, and the compounds were then produced in large amounts, becoming the
predominant aroma components of Pu-erh tea at day 30 and day 45 with relative
abundances of 37.92% and 48.68%, respectively (Fig. 1B). Among these
methoxy-phenolic compounds, 1,2,3-trimethoxybenzene, 1,2,4-trimethoxybenzene and
1,2-dimethoxybenzene, which have a stale odor, were always among the top 10 volatile
compounds during the Pu-erh tea SSF process (Table 1). Moreover, these three
methoxy-phenolic compounds all increased during the SSF process; in particular, the
relative abundance of 1,2,3-trimethoxybenzene was only 1% in the raw material, but
it became the predominant volatile compound in Pu-erh tea at day 45 (Fig. 5A).

The exact formation pathway of methoxy-phenolic compounds in Pu-erh tea is not
clear; they are generally considered to be products of the methylation of gallic
acid by microbial enzymes and of thermal degradation. Methyltransferases in the
category secondary metabolites biosynthesis, transport and catabolism (Q), which are
responsible for methylation, were found to have similar relative abundances of
0.064% and 0.069% at day 15 and day 30 (Fig. 5B). Further analysis showed that the
predominant methyltransferase-producing genera at day 15 were Pseudomonas (49.34%)
and Aspergillus (43.35%) (Fig. 5C). Pseudomonas was hardly detected at day 30, when
Achromobacter became the predominant producing genus (17.31%), followed by
Sugiyamaella (9.00%) and Aspergillus (7.33%) (Fig. 5C). Our results confirm for the
first time the dynamics and high production of methoxy-phenolic compounds in the
Pu-erh tea SSF process, and further identify the potential role of
methyltransferases and their producing genera in the formation of methoxy-phenolic
compounds, which provides new guidance for studying the formation pathway of
methoxy-phenolic compounds in Pu-erh tea.


3.7. Alcohols production and its related microorganisms

As important aroma components of tea, alcohols have characteristic scents, including 
floral, sweet and woody notes, and have a good coordinating effect on the flavor of Pu-erh tea. 
As shown in Fig. 1B, alcohols were predominant in the early SSF stage: their relative 
abundance increased from 38.27% at day 0 to 54.99% at day 15, and then 
declined sharply to 31.7% and 21.00% at day 30 and day 45, respectively. 
Among these alcohols, linalool, linalool oxide I, linalool oxide II and 
terpineol, with floral, wood and clove odors, were all listed among the top 10 volatile 
compounds (Table 1), and the amounts of these four alcohols followed 
trends similar to that of total alcohols (Fig. 6A), suggesting that they 
are important aroma components of Pu-erh tea. 
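
To make the rise-and-fall pattern of total alcohols concrete, the sketch below works only from the four relative abundances quoted in the text. It is an illustration, not the authors' analysis; the variable names are hypothetical. It reports the percentage-point change between consecutive sampling days, the day on which the relative abundance peaks, and the fraction of that peak lost by the final sampling day.

```python
# Relative abundance (%) of total alcohols at each sampling day,
# as quoted in the text (Fig. 1B).
TOTAL_ALCOHOLS = {0: 38.27, 15: 54.99, 30: 31.7, 45: 21.00}

days = sorted(TOTAL_ALCOHOLS)

# Percentage-point change between consecutive sampling days.
step_changes = {
    (start, end): round(TOTAL_ALCOHOLS[end] - TOTAL_ALCOHOLS[start], 2)
    for start, end in zip(days, days[1:])
}

# Day with the highest relative abundance.
peak_day = max(days, key=TOTAL_ALCOHOLS.get)

# Fraction of the peak relative abundance lost by the last sampling day.
loss_from_peak = 1 - TOTAL_ALCOHOLS[days[-1]] / TOTAL_ALCOHOLS[peak_day]

for (start, end), delta in step_changes.items():
    print(f"day {start} to day {end}: {delta:+.2f} percentage points")
print(f"peak at day {peak_day}; {loss_from_peak:.1%} of the peak lost by day {days[-1]}")
```

Because these are relative abundances, a decline in the alcohol share can also reflect growth in other compound classes, so the numbers describe composition rather than absolute amounts.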


β-Primeverosides and β-glucopyranosides of linalool and linalool oxides in tea 
leaves are thought to be the main precursors for the biosynthesis of linalool and linalool 
oxides, which can be released by enzymatic hydrolysis with the corresponding 
degrading enzymes, primeverosidase and glucosidases. Further analysis based on the 
CAZy database showed that primeverosidase and glucosidases were detected 
throughout the Pu-erh SSF process, and that the relative abundance of both enzymes 
increased significantly during the SSF process (Fig. 6B). For primeverosidases, in 
the early stage, 93.80% of the producing strains originated from Eukaryota and only 
3.84% came from Bacteria, and the predominant genus 
was Aspergillus, with a relative abundance of 71.75%. In the late SSF stage, only 
58.70% of primeverosidases originated from Eukaryota and 41.20% came 
from Bacteria. At the genus level, the relative abundance of Aspergillus fell to 
only 5.49%, and the relative abundances of Bacillus, Sugiyamaella, Rasamsonia and 
Listeria increased to 12.69%, 9.80%, 8.46%, and 7.45%, respectively (Fig. 6C). 
Similarly, in the early stage, 90.33% of glucosidases came from Eukaryota and 8.75% 
belonged to Bacteria, and the predominant glucosidase-producing genus was 
also Aspergillus (68.87%). In the late SSF stage, the relative abundance of 
glucosidase-producing Aspergillus fell to only 7.06%, and Sugiyamaella was the 
main glucosidase-producing genus (12.87%), followed by Rasamsonia (11.12%) and 
Bacillus (7.96%) (Fig. 6D). 

However, although the relative abundance of primeverosidase and glucosidases both increased 
in the late stage, it is worth noting that the amount of total alcohols decreased 
dramatically in the late stage. A previous study also found that the amount of alcohols 
was significantly reduced during the SSF process, which is another characteristic that distinguishes Pu-erh tea 
from other kinds of tea. The decrease in the relative abundance of alcohols in the late 
stage of Pu-erh SSF is partly due to the high volatility of alcohols, combined with the weekly 
turning over of the tea leaves during fermentation. 


3.8. Carvone production and the related microorganisms 

Ketones were only a small part of the aroma components in Pu-erh tea. The main 
ketone compounds in the raw materials were geranyl acetone (1.67%), β-ionone 
(3.53%), and α-ionone (0.77%). The relative abundance of total ketones and of the main 
compounds, including geranyl acetone, β-ionone, and α-ionone, all decreased during 
the SSF process. In contrast, carvone was produced during the SSF process 
(Fig. 7A), with its relative abundance increasing from 0 to 0.32% at day 45, which was 
also observed in a previous study. Further analysis by KEGG annotation found that 
carveol dehydrogenase (EC 1.1.1.275), which is responsible for transforming carveol to carvone, 
increased significantly during the SSF process (Fig. 7B, 7C). The main carveol 
dehydrogenase-producing strain was Microbacterium sediminis, the amount of which 
also increased significantly during the SSF process (Fig. 7D). These results 
help clarify the carvone formation pathway during the Pu-erh tea SSF process. 


3.9. Biological interaction networks of the dominant flavour compounds 

Biological interaction network analysis of the formation pathways of 
theabrownin, methoxy-phenolic compounds, alcohols, and carvone indicated that 
Aspergillus was the functional core microorganism in the early SSF process, 
participating in the formation of all these dominant flavour compounds through the corresponding 
functional genes (Fig. 8). In the late SSF stage, the diversity and abundance of the 
microbiota increased significantly, and many other genera, especially Bacillus, 
Rasamsonia, Lichtheimia, and Debaryomyces, were found to contribute to more than 
one dominant flavour compound (Fig. 8), suggesting that they were the functional core 
microorganisms for flavour production in the late stage. These complex metabolic 
networks suggest a significant influence of the multispecies microbial 
community on the quality of Pu-erh tea, and provide a valuable reference for 
further genetic studies of the Pu-erh tea microbiota. 


4. Conclusion 

In this study, the significant variations in the microbial community, functional genes 
and characteristic flavor metabolites during the Pu-erh tea pile fermentation process were 
studied together by metagenomic and metabolomic approaches for the first 
time. In addition, a metabolic network for the formation of the dominant flavor compounds by the 
Pu-erh tea microbiota was constructed. The genus Aspergillus was the main 
flavor-producing microorganism in the early fermentation, while many other genera, 
including Bacillus, Rasamsonia, Lichtheimia, and Debaryomyces, were identified as the 
functional core microorganisms for flavor production in the late fermentation. Our 
approach helps to elucidate the mechanisms of metabolite formation in the microbial 
community of Pu-erh tea and to further improve tea quality. 

Figure Captions: 

Fig. 1 Dynamics of the Pu-erh tea leaves and the volatile compounds during solid 
state fermentation. (A) The changes in tea leaves and tea liquors at day 0, 15, 30 and 
45. (B) The relative abundance of the nine categories of volatile compounds during 
solid state fermentation. (C) Principal component analysis of the volatile compounds 
at day 0, 15, 30 and 45. 


Fig. 2 Correlation analysis (A) and Venn diagram (B) of the number of predicted 
genes among the six tea samples. Different colors represent Spearman correlation 
coefficients. The relationship between correlation coefficient and color is shown in the 
legend on the right. A left-leaning ellipse indicates that the correlation coefficient is negative 
and a right-leaning ellipse indicates that the correlation coefficient is positive. The flatness of the 
ellipse indicates the absolute value of the correlation coefficient. 


Fig. 3 Dynamics of the functional microorganisms during Pu-erh tea solid state 
fermentation at the kingdom (A), phylum (B), genus (C), and species level (D), and the 
cladogram of microorganisms with significant differences between day 15 and day 
30 (E). Green nodes and red nodes represent the important microorganisms at day 15 
and day 30, respectively, and microorganisms with no significant differences are 
uniformly colored yellow. 




Fig. 4 Theabrownin (TB) production during Pu-erh tea solid state fermentation by 
microorganisms and the related enzymes. (A) Previously reported pathway of TB 
production. 1, catechol oxidase; 2, peroxidase; 3, laccase. Dynamic changes of TB 
and tea polyphenol (B), laccase and peroxidase (C), the main laccase-producing 
microorganisms at the genus level (D), and the main peroxidase-producing microorganisms (E) 
during solid state fermentation. Data are given as the mean ± SEM; * represents p < 
0.05, ** represents p < 0.01. 


Fig. 5 Methoxy-phenolic compound production during the solid state 
fermentation of Pu-erh tea by microorganisms and the related enzymes. 
Dynamic changes of the three main methoxy-phenolic compounds (A), 
methyltransferases in "secondary metabolites biosynthesis, transport and catabolism" 
(B), and methyltransferase-producing microorganisms at the kingdom (C) and genus (D) 
level during the SSF. Data are given as the mean ± SEM; * represents p < 0.05, ** 
represents p < 0.01. 


Fig. 6 Alcohol production during Pu-erh tea solid state fermentation by 
microorganisms and the related enzymes. Dynamic changes of the main alcohols 
(A), primeverosidase and glucosidase (B), and the main primeverosidase-producing (C) and 
glucosidase-producing (D) microorganisms at the genus level during solid state 
fermentation. Data are given as the mean ± SEM; * represents p < 0.05, ** represents p 
< 0.01. 


Fig. 7 Carvone production during Pu-erh tea solid state fermentation by 
microorganisms and the related enzymes. Dynamic changes of carvone (A), 
the carvone pathway from KEGG annotation (B), carveol dehydrogenase (EC 1.1.1.275) 
(C), and carveol dehydrogenase-producing microorganisms at the genus level (D) 
during solid state fermentation. Data are given as the mean ± SEM; * represents p < 
0.05, ** represents p < 0.01. 


Fig. 8 Correlation between the characteristic metabolites, including theabrownin, 
methoxy-phenolic compounds, alcohols, and carvone, their producing 
microorganisms, and the related enzymes during Pu-erh tea solid state fermentation. 
Green nodes represent the characteristic metabolites, purple nodes represent the 
producing microorganisms, and yellow nodes represent the related enzymes. Node size 
is proportional to the number of significant correlations. Edge color indicates a 
decreasing trend (blue) or an increasing trend (pink) of the related microorganisms during 
solid state fermentation. A black edge indicates that the enzyme is related to the formation 
of the characteristic metabolites. 


Supporting Information 

Table S1 Statistical information of the assembled scaftigs from each sample. 

Table S2 Statistical information of the gene catalogue of the Pu-erh tea metagenome in 
this study. 

Table S3 The numbers of genera and species in Pu-erh tea at day 15 and 
day 30. 

Table S4 The average contents of the volatile compounds identified at different 
stages of the Pu-erh tea fermentation process. 

Table S5 The raw data imported into Cytoscape for biosynthesis network analysis. 

Fig. S1 Annotation and matching of genes among the six tea samples using the 
eggNOG database. 

Fig. S2 Heatmap of the top 35 genes among the six tea samples functionally 
annotated and matched using the CAZy database. 1: Day15-1; 2: Day15-2; 3: 
Day15-3; 4: Day30-1; 5: Day30-2; 6: Day30-3. 

Fig. S3 Predicted metabolic profiles of the Pu-erh tea metagenomic data in the KEGG 
database at the second level. 


Table 1 The top 10 volatile compounds of Pu-erh tea during the solid state 
fermentation 

                                                                    Relative abundance (%) 
No.  Compounds                                            Odor          Day 0   Day 15  Day 30  Day 45 
1    1,2,3-Trimethoxybenzene                              Stale         1.00    3.43    26.92   35.65 
2    1,2,4-Trimethoxybenzene                              Stale         ND      1.62    6.36    7.78 
3    Linalool oxide II                                    Floral        4.01    11.17   8.60    6.10 
4    Trimethylsilyl acetic ester                          -             7.74    4.08    4.65    5.73 
5    Trimethyl-Silanol                                    -             4.63    7.03    7.93    4.60 
6    Linalool oxide I                                     Floral        2.54    6.65    4.46    3.85 
7    1,2-Dimethoxybenzene                                 Stale         0.69    1.72    3.20    3.82 
8    1,1,1,5,5,5-Hexamethyl-3,3-Trimethylsilyltrisiloxane -             4.49    2.30    2.91    2.74 
9    Terpineol                                            Wood, clove   1.90    4.07    3.01    [illegible] 
10   Linalool                                             Floral        15.04   18.95   4.59    [illegible] 

"-" means odor information is not clear. 


Highlights 

1. The significant variations in the composition of the microbiota, functional genes and 
characteristic metabolites during the Pu-erh tea fermentation process were simultaneously 
studied by metagenomic and metabolomic approaches for the first time. 

2. The biosynthesis or metabolic pathways of the dominant flavour compounds, 
especially theabrownins, methoxy-phenolic compounds, alcohols, and carvone, in 
Pu-erh tea were proposed. 

3. Aspergillus was the main flavour-producing microorganism in the early fermentation, 
while many other genera, including Bacillus, Rasamsonia, Lichtheimia, and Debaryomyces, 
were identified as the functional core microorganisms for flavour production in the 
late fermentation. 


